(19) Europäisches Patentamt
     European Patent Office
     Office européen des brevets

(11) EP 1 432 203 A2

(12) EUROPEAN PATENT APPLICATION

(43) Date of publication: 23.06.2004 Bulletin 2004/26
(51) Int Cl.7: H04L, H04M 7/00
(21) Application number: 03025357.9
(22) Date of filing: 04.11.2003
(84) Designated Contracting States:
     AT BE BG CH CY CZ DE DK EE ES FI FR GB GR
     HU IE IT LI LU MC NL PT RO SE SI SK TR
     Designated Extension States:
     AL LT LV MK
(30) Priority: 17.12.2002 US 433922 P
               03.04.2003 US 406396
(71) Applicant: TLS Corporation
     Cleveland, Ohio 44114 (US)
(72) Inventors:
     • Shay, Gregory F.
       Mentor, OH 44060 (US)
     • Church, Steven
       Avon Lake, OH 44012 (US)
(74) Representative: Henkel, Feiler & Hänzel
     Möhlstrasse 37
     81675 München (DE)

(54) Low latency digital audio over packet switched networks 



(57) Method and Apparatus for delivering audio sig- 
nals from a source node to a destination node on a net- 
work. The apparatus uses a number of switches that 
transmit prioritized data on a packet network. The 
switches are coupled to a number of send/receive nodes 
for sending and receiving digital audio signals on the da- 
ta network. The audio packet size and the receive buff- 
ers are sized to store a minimum possible number of 
audio samples to minimize latency in processing audio 
signals arriving at said receive node, but still ensure au- 
dio delivery without interruption due to packet data net- 
work delay. An additional feature of the invention is re- 
covery of clock synchronization over the same data net- 
work by novel arrangement of transmission of timing 
packets on the network. By sending a multiplicity of 
packets at irregular intervals a minimum network transit 
delay can be determined by each of the receive nodes 
which allows the receive nodes to filter out packet net- 
work transit delay error and maintain accurate local 
clocks. 



[Fig. 3: timing diagram of audio packet groups 1, 2, ... N
in successive audio frame times (e.g. 250 µs), showing the
unused network bandwidth time, the transmission time of
the largest packet (120 µs), and the gap 130.]



Printed by Jouve, 75001 PARIS (FR) 






Description 

Cross Reference to Related Applications 

[0001] The present application claims priority from
provisional application serial number 60/433,922 filed
December 17, 2002.

Field of the invention 

[0002] The present invention concerns digital audio 
and more particularly a low latency means of transmit- 
ting digital audio signals over a network having multiple 
connections or nodes. 

Background art 

[0003] Computer networks are defined by their struc- 
ture - bus, star, or some combination, and the organiza- 
tion of their bits - packets, continuous, or some combi- 
nation. 

[0004] Computer networks are almost always
packet-based. That is because data is naturally bursty. A
lot of data flows when a user opens a web page, but while
he or she is reading it there is no data moving. Packets
also let a number of terminals share the same wire.
[0005] In contrast, digital telephone networks are
"circuit-switched": a circuit is open for the duration of
the connection. These two styles are good matches to the
two data types, but there has developed a need to mix
them. If one has only a telephone line and wants to
connect to the Internet, the data packets must be
formatted and (usually) sent off to a modem. This works,
but is inefficient because the line is held open and null
data is sent between the bursts of data that matter. If
one wants to send audio over packet networks, the
continuous audio data must be converted into packets, and
the packets are then reassembled into audio signals at the
receiving end.
[0006] Efforts to improve this cumbersome process 
make sense because: 

• computer networks are much cheaper these days than
  circuit-oriented networks owing to their ubiquity and
  high volume,

• it is often desirable to have both audio and data
  simultaneously on the same network,

• and computers are now very often either the source or
  destination for audio signals.

[0007] One example that illustrates a convergence of
the two network styles most clearly is the VoIP (Voice
over Internet Protocol) telephone application that is
rapidly gaining popularity. The idea is that only one
cable is needed to connect both a PC and a telephone. The
switch that makes this happen is a cheap commodity
Ethernet switch rather than an expensive proprietary
PBX. The cost benefit is significant.



[0008] The same reasoning applies to the high-fidelity
audio networks used in radio stations and other studio
facilities, with their expensive PBX-like router switches
at the core. Thus, the motive to use Ethernet for audio
transmission.

Original Ethernet 

[0009] Originally, Ethernet networks were packet
networks, but by convention, Ethernet packets are also
called frames (not to be confused with the term audio
frames used later in this application). These range from
72 to 1526 bytes, depending on the amount of data to
be carried. The original Ethernet was based on a single
shared coaxial cable - the Ether in Ethernet's name.
The very first versions used a 1/2" thick cable with
physical taps into it - one actually had to cut a little
piece out of the jacket and screw in a metal part that
made contact with the ground and center conductors. Later,
the coax cable was smaller and T-connectors were used at
the back of connected computers, but the principle
remained the same. Even when Ethernet transitioned to
telephone-style twisted-pair wires with a central hub, the
medium was shared in the same way.

[0010] When a terminal was transmitting, it owned the
full capacity of the cable. That means that there had to
be some method to arbitrate access so that data from
the various terminals didn't interfere with each other and
so that all had a chance to get on the bus and use a fair
piece of the available bandwidth. This was done by the
MAC - Media Access Controller - in each terminal.
Robert Metcalfe invented the method at Xerox PARC in
1973. His mechanism senses when a collision occurs -
collision detect. Upon detecting a collision, both
sending terminals would choose a random back-off time
and then retransmit their packets with a good probability
of success. The system also included a listen-before-talk
function to reduce collisions - carrier sense. Using
these methods, all terminals could share access to the
channel - multiple access. Put these all together and you
understand why Ethernet's channel access protocol is
called Carrier Sense Multiple Access with Collision
Detect (CSMA/CD).

[0011] United States patents no. 6,161,138, no.
5,761,431, and no. 5,761,430 are assigned to Peak Audio.
The technology disclosed in these patents allows
audio signals to be reliably sent over the classic shared
Ethernets. One of the connected terminals is set to be
the "conductor" and sends a synchronizing packet onto
the network that all terminals listen to. Then each
terminal is assigned a timeslot on the network. The slots
are offset in time with reference to the conductor's beat
packet. That way, no collision or packet contention occurs,
so that smooth audio flow is obtained. These patents
describe the method of using a "beat clock" to control
access to a shared network among audio terminals in an
isochronous fashion so that each terminal puts its
packets on the network in a prescribed time slot.






Switched Ethernet 

[0012] While the marketing name has been retained
and there is compatibility with the original Ethernet,
modern switched Ethernet is a fundamentally different
technology. With a dedicated full-duplex connection
from each terminal and a central switch that routes
traffic, Ethernet is no longer a shared medium system, and
therefore does not need or use a Media Access
Controller and the associated CSMA/CD scheme. Network
Interface Cards used with Ethernet switches
automatically disable these functions.

[0013] The aforementioned three patents that are
assigned to Peak Audio relate to the classic Ethernet
CSMA/CD architecture with its shared medium approach
and do not mention switched Ethernets. Peak Audio is
presently marketing an audio networking system under
the designation CobraNet which is used over switched
networks and may benefit from the switched Ethernet
architecture because it may provide more aggregate
bandwidth and thus more audio channels are possible.
However, CobraNet does not use switched Ethernet
efficiently when audio and data share a link. CobraNet
must route any data that shares a link with audio through
its access module to ensure that it does not interfere
with smooth audio flow.

Summary of the Invention 

[0014] The present invention takes advantage of
switched Ethernet to transmit audio by means of a
network to multiple nodes on the network. The invention
provides:

• Transmission of audio with no interruptions

• Low latency in audio delivery

• Implementation using off-the-shelf Ethernet switches

• Audio signals that share the network with data signals

[0015] Broadcast studios have the requirement that
disc jockeys be able to listen to themselves in
headphones. Maximum tolerable delay is around 15 ms.
There may be multiple links in the
microphone-to-headphone path and maybe some processors, so
each link has to have low delay in order to keep the
cumulative effect below the threshold. Practice of the
present invention comfortably achieves this latency
requirement. The invention accomplishes reliability and
low delay by:

• Tagging audio packets with a higher priority value
  than data so network interfaces and switches can
  distinguish them and put the audio packets at the
  head of their queues or buffers. This is done on a
  per-packet basis, not by assigning particular Ethernet
  switch ports permanently to high priority, so that
  a link may pass both high-priority audio and
  lower-priority data.

• Never allowing link capacity to be overfilled.
  Terminals are in control of the streams they transmit
  and also the ones they request the switch to send them
  for reception. They have a function that calculates the
  link capacity, compares it to how much is already being
  used, and decides if there is enough space for more
  before connecting any new audio channel. This is in
  contrast to normal Ethernet operation, which is a "best
  effort" system with no way to limit offered data.

• Using a clock and PLL (phase locked loop) system to
  synchronize the audio bit-level transmit and receive
  clocks in terminals.
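
The capacity check described in the second item can be
sketched as follows. This is a minimal illustration, not
the applicant's implementation: the names Link,
try_admit and stream_rate_bps, the 3 bytes per sample and
the 64 bytes of per-packet overhead are all assumptions
chosen for the example.

```python
from dataclasses import dataclass, field


def stream_rate_bps(samples_per_packet, bytes_per_sample, overhead_bytes, frame_time_us):
    """Bit rate of one audio stream that sends one packet per audio frame time."""
    bits_per_packet = (samples_per_packet * bytes_per_sample + overhead_bytes) * 8
    return bits_per_packet * 1_000_000 // frame_time_us


@dataclass
class Link:
    """Hypothetical bookkeeping a terminal keeps for one network link."""
    capacity_bps: int
    admitted_bps: list = field(default_factory=list)

    def used_bps(self):
        return sum(self.admitted_bps)

    def try_admit(self, rate_bps):
        # Refuse the new channel if it would overfill the link.
        if self.used_bps() + rate_bps > self.capacity_bps:
            return False
        self.admitted_bps.append(rate_bps)
        return True


# Assumed example values: 12 samples per packet, 3 bytes per sample,
# 64 bytes of overhead per packet, 250 us frame time, 100 Mbit/s link.
rate = stream_rate_bps(12, 3, 64, 250)
link = Link(capacity_bps=100_000_000)
admitted = 0
while link.try_admit(rate):
    admitted += 1
```

With these assumed numbers each stream needs 3.2 Mbit/s,
so 31 channels are admitted and the 32nd is refused. A
real terminal would presumably leave headroom for
non-audio data rather than fill the link completely.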

[0016] These and other objects, advantages and
features of the invention are described with a degree of
particularity in conjunction with the accompanying
drawings.

Brief Description of the Drawings. 
[0017]

Figure 1 is a schematic representation of an audio
network constructed in accordance with the invention;

Figure 1A is a schematic depiction of a packet
switched Ethernet network;

Figure 2 is a schematic depiction showing multiple
data queues having different priority;

Figure 3 is a schematic of three timed buffer contents
showing a means of reducing latency of audio
packets received at a node;

Figure 4 depicts a timestamp method of clock
comparison and synchronization;

Figures 5A and 5B are depictions showing methods
of estimating probabilities of clock packets arriving
with minimum delay;

Figure 6 is a histogram of clock packet time offsets;

Figure 7 is an example of clock packet transmission
designed to overcome a bursty network traffic pattern
on a network;

Figures 8A, 8B and 8C are depictions of clock packet
transmissions designed to overcome isochronous
network traffic; and

Figure 9 is a block diagram of a node on the network
of Figures 1 and 1A.

Best mode for practicing the invention 

[0018] Figure 1 is a schematic depiction of a general
architecture design of a network 10 that is used at a
facility having multiple computers 12 and other audio
equipment 14. The network 10 uses a switched Ethernet
network for delivering both audio and data to any node
(such as one of the computers 12) on the network. A
node need not include an entire computer but instead
may simply be circuitry that includes a network interface
circuit and an audio jack for plugging in a speaker, set
of headphones, microphone or amplifier. Figure 9 is a
functional block diagram of a typical node on the
network 10.

[0019] Key to implementing the network shown in
Figure 1 is the use of priority tagging and the action of
Ethernet switches 22 (three of which are depicted in
Figure 1) that deliver higher priority packets first,
before any waiting lower priority (non-audio) packets.
Another design point is for each channel receiver
(non-switch node) to have just enough audio data buffer to
allow one full size (non-audio) packet to come through
and not cause an audio dropout. The priority service
action of the Ethernet switch will then guarantee that no
further non-audio packets are allowed through until all
delayed pending high priority audio packets are delivered.
[0020] The Ethernet switches 22 shown in Figure 1
operate in conformity with IEEE standard 802.1Q-1998
and therefore recognize priority bits in the header of
data packets that are transmitted between nodes of the
network.
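
As an illustration of the priority bits mentioned above,
the following sketch packs and parses the 4-byte IEEE
802.1Q tag, in which the top three bits of the tag
control field carry the priority. The helper names and
the choice of priority 6 for audio are assumptions made
for this example, not values taken from the application.

```python
import struct

TPID_8021Q = 0x8100   # EtherType value that identifies an 802.1Q tag
AUDIO_PRIORITY = 6    # assumed value for illustration only


def build_vlan_tag(priority, vlan_id=0, dei=0):
    """Return the 4-byte 802.1Q tag: TPID followed by the tag control field."""
    if not 0 <= priority <= 7:
        raise ValueError("priority must fit in 3 bits")
    if not 0 <= vlan_id <= 0xFFF:
        raise ValueError("VLAN ID must fit in 12 bits")
    # Priority occupies bits 15-13, DEI bit 12, VLAN ID bits 11-0.
    tci = (priority << 13) | ((dei & 1) << 12) | vlan_id
    return struct.pack("!HH", TPID_8021Q, tci)


def parse_priority(tag):
    """Extract the 3-bit priority from a 4-byte 802.1Q tag."""
    tpid, tci = struct.unpack("!HH", tag)
    if tpid != TPID_8021Q:
        raise ValueError("not an 802.1Q tag")
    return tci >> 13


audio_tag = build_vlan_tag(AUDIO_PRIORITY)
```

A sender would insert such a tag after the source address
of each audio frame, and a switch reads the same three
bits to pick the output queue.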

[0021] Referring to Figure 1A, packet switched
networks, in particular Ethernet, move groups of data,
called packets (A), from senders (B) to receivers (C) over
a shared network of communication media (wires,
wireless, fiber optic, etc.). Each packet A has information
contained in it, called the destination address 24, that
indicates which receiver C that packet is intended to go
to.

[0022] Each of the senders and receivers includes a
digital circuit for encoding and decoding packets as well
as performing clock functions. Many packets may be
sent into the network by many senders to any receiver
concurrently, and packets proceed through the network
to each receiver simultaneously. Intermediate nodes in
the network, acting as switches 22, forward packets on
toward their intended destinations using the destination
address 24 in each packet. The communication links (E)
between nodes are used in common for packets of many
different destinations. Since the communication
capacity, called bandwidth, is finite on these links, each
packet takes a certain finite amount of time to be
transmitted across a link, which means that other packets
in the switches 22 that need to go down the same link must
wait for the previous packet to finish. Many packets
waiting in the switch 22 form a queue (F), and the overall
amount of time spent waiting in these queues is called
queuing delay or switching delay. Given a mix of many
types of packets from many senders B to receivers C at
many different times, these switching delays are
generally not precisely predictable, and have variable,
chaotic, and to some extent random behavior. Note that,
due to the nature of the network, a sender becomes a
receiver and vice versa as data is transmitted to the
various nodes on the network 10.

[0023] This switching delay, its magnitude, its
variations, and its effect on the streams of data packets
trying to flow through the packet switched network is a
problem addressed by the invention.

Effect on Digital Audio Streams 

[0024] For a data stream, such as an audio program,
the digital audio data forms a sequence of packets. Each
packet represents a time ordered number of audio
samples. In order to correctly reproduce the audio program,
the receiver C must output each audio sample in its
correct time order, at the set time interval in relation to
other audio samples sent to a given receiver C. Any audio
sample not output at the correct time results in a
distortion of the audio program and audible noise, and
otherwise degrades the fidelity of the audio reproduction.
Therefore, in order to communicate a digital audio program
made up of a stream of packets over a packet switched
network, the effect of the above packet switching delay
and its variations must be dealt with by the invention.

Low Latency Audio 

[0025] Latency is measured as the overall delay from
the input of audio to the output of the audio from a node
20 on the network 10. It is undesirable from a user's
point of view to have too much audio delay introduced
as a result of transporting audio from place to place on
the network 10. Many audio programs rely on
synchronization of many audio, video, and other parts of a
program or presentation. Excessive delay causes sounds
to not happen at the correct times, an aesthetically
unpleasant result. In addition, listening to an audio
program for the purpose of monitoring its correctness is
affected by audio delay, as even relatively small delays
can cause unpleasant, unnatural perceived effects. (For
instance, speaking into a microphone while listening to
yourself on headphones with an audio delay of a few
tens of milliseconds causes the audio in your ears to
be out of phase with the sound coming from your mouth,
which is distracting and unpleasant.)
[0026] Therefore, it is desired to be able to transport
audio programs over packet switched networks with
small enough latency (delay) in the audio so as to not
produce these unwanted audio delay related effects. A
numerical value for maximum acceptable delay to be
regarded as "low latency" is less than 1 millisecond for
each traversal of the packet switched network 10.

Audio Buffering 

[0027] Because of the variation in time of the delivery
of data packets by the packet switched network 10, the
receivers C (Figure 1A) must hold a certain amount of
audio data ahead of time in a buffer. If the correct time
to output each audio sample is regarded as a time
deadline, then the buffer holds the upcoming required
audio data locally, so that the deadlines will be satisfied
and no audio distortion can occur. The problem is that
this local buffer in each receiver C directly adds latency
to the audio, which is undesirable and works against
the problem to be solved, that of delivering low latency
audio.
[0028] In accordance with an exemplary embodiment
of the invention, low latency audio delivery is achieved
by use of only just enough buffering, chosen with a view
to the particular characteristics of the packet switched
network 10. This solution can be regarded as the
minimum possible buffering for a given set of packet
switched network characteristics.

Use of network packet priority 

[0029] Switched packet networks, in particular
switched Ethernet, allow a packet priority value to be
assigned to each packet individually. When multiple
packets are waiting in the queue to be sent, the switches
22 use this priority value to determine the order that the
packets are sent out on each link. Without priority, the
packets are sent in simple first in, first out order. With
priority, the switch assures that a higher priority packet
is never made to wait behind a packet with lower priority.
One can think of a switch with the priority mechanism as
having multiple queues, one for each priority, since
packets belonging to the same priority level do queue
behind each other. See the depiction of multiple queues
30a - 30d shown in Figure 2.
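
The multiple-queue view of a switch output port can be
sketched as below. This is a simplified model for
illustration only: the class name and the use of four
priority levels (matching the four queues 30a - 30d of
Figure 2) are assumptions, and real switches implement
this in hardware.

```python
from collections import deque


class StrictPriorityPort:
    """Model of one switch output port with one FIFO queue per priority."""

    def __init__(self, levels=4):
        self.queues = [deque() for _ in range(levels)]

    def enqueue(self, packet, priority):
        # Packets of equal priority queue behind each other (first in, first out).
        self.queues[priority].append(packet)

    def dequeue(self):
        # Always serve the highest non-empty priority first.
        for queue in reversed(self.queues):
            if queue:
                return queue.popleft()
        return None


port = StrictPriorityPort()
port.enqueue("data-1", priority=0)
port.enqueue("audio-1", priority=3)
port.enqueue("data-2", priority=0)
port.enqueue("audio-2", priority=3)
order = [port.dequeue() for _ in range(4)]
```

Here both audio packets leave before either data packet,
although "data-1" was queued first.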

[0030] In accordance with an exemplary embodiment
of the invention, in a network 10 carrying mixed types of
traffic (audio and non-audio), audio packets are
assigned by a sender B a priority value higher than that of
the non-audio data carrying packets. This guarantees that
inside a switch 22, if there are any audio packets
pending, they will be sent before all non-audio packets.

Queuing delay: with priority 

[0031] Assigning audio packets higher priority does
not result in audio packets having no delay in the
switches, since the case may happen that a switch 22 has
just begun to send a lower priority non-audio packet at
the moment an incoming audio packet of higher priority
arrives. Packet transmissions through a link are never
interrupted once started, so the high priority audio
packet that just arrived can experience a delay of up to
the transmission time of the largest possible non-audio
packet. The transmission time of the largest packet
possible is an important parameter of the exemplary
embodiment of the invention for achieving low latency
audio over a packet switched network with priority. This
transmission time of the largest possible packet
determines the minimum additional time that each receiver C
must hold audio data in its buffer, determines the
minimum buffer size, and thus determines the minimum
latency possible for end to end audio delivery through the
links of the network.



Determination of the minimum audio receive buffer size 

[0032] A time period, called the audio frame time
period 100, is chosen as the fundamental interval of time
at which packets of audio samples are communicated
over the network. The smaller the audio frame period,
the lower the end to end latency, but the higher the
packet overhead, since sending even one audio sample
requires the use of a minimum packet size. A choice is
made to minimize the packet overhead, minimize the
audio latency, and maximize the number of audio
channels (which is the number of audio packets, one packet
per channel) the network can carry. Since the audio
latency is also determined by the above described queuing
delay, it is of little advantage to choose the audio
frame period to be less than the queuing delay.
Therefore, in the exemplary implementation, the audio frame
period is chosen to be 250 µs, about twice the queuing
delay. This results in each audio packet carrying 12
audio samples (sampled at 48 kHz).
[0033] The minimum buffer size at each receive channel
is the audio frame time plus, for each intervening switch
the audio path traverses, the transmission time of the
largest possible packet. For the example of 100BASE-T
Ethernet, the maximum packet size is nearly 1500 bytes
(ignoring the header and inter-packet gap, which add a few
dozen more effective bytes), which means the maximum
transmission time of the largest size packet is
(1500 x 8 bits per byte) / 100,000,000 bits per second =
120 microseconds. For the example of digital audio data
sampled at 48 kHz, this means the minimum buffer size
possible on a 100BASE-T Ethernet packet switched
network is 120 µs / (1/48000 s) = 5.76, rounded up to 6
audio samples for each switch the audio stream route
passes through, plus the audio frame time.
[0034] The size of the buffers in the receivers C for
minimum audio latency is computed according to the
above formula. In the example of the network 10 of
Figure 1, having a maximum of two switches between
sender and receiver nodes, a frame time of 250 µs, or
12 audio samples at a 48 kHz sample rate, is chosen.
Therefore the buffer size (in terms of audio samples) is
the audio frame plus two times the transmission time of a
maximum sized packet, or: 12 + (2 x 6) = 24 audio
samples.
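
The arithmetic of the buffer size formula can be checked
with a short sketch. The function names are introduced
here for illustration; exact fractions are used so that
the rounding up to whole samples is not disturbed by
floating point error.

```python
import math
from fractions import Fraction


def samples_per_frame(frame_time_us, sample_rate_hz):
    """Number of audio samples that span one audio frame time."""
    return Fraction(frame_time_us * sample_rate_hz, 1_000_000)


def max_packet_time_us(max_packet_bytes, link_bps):
    """Transmission time of the largest packet, in microseconds."""
    return Fraction(max_packet_bytes * 8 * 1_000_000, link_bps)


def min_buffer_samples(frame_time_us, sample_rate_hz, max_packet_bytes, link_bps, switches):
    """Audio frame plus one largest-packet time (rounded up) per switch."""
    frame = math.ceil(samples_per_frame(frame_time_us, sample_rate_hz))
    packet_us = max_packet_time_us(max_packet_bytes, link_bps)
    per_switch = math.ceil(packet_us * sample_rate_hz / 1_000_000)
    return frame + switches * per_switch


# Values from the text: 250 us frame, 48 kHz, 1500-byte packet, 100 Mbit/s, 2 switches.
buffer_samples = min_buffer_samples(250, 48_000, 1500, 100_000_000, switches=2)
```

With the values from the text the result is 24 samples,
that is 12 for the audio frame plus 6 for each of the two
switches.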

[0035] Buffers of this size for storing incoming audio
data are contained in the receiver nodes that can receive
audio.

[0036] Note that Ethernet switches 22, which are
standard commercially available devices, have larger
buffers for storing data, but for a different purpose. The
Ethernet switch needs the larger buffers to implement
the priority scheme (and the queue construct of Figure
2) set up by the priority bit (or bits) of an incoming
packet. In the event that the switch 22 receives a higher
priority packet that needs to be sent to a destination, any
lower priority packets coming into the switch over other
connections must be buffered.

Action of the receive audio buffers: recovery from non- 
audio packet 

[0037] One can refer to Figure 3 to understand how
the behavior of the packet switched network 10 with
priority packet designations allows such small buffers. Low
latency is achieved, but the number of audio channels the
network can carry is not limited. Consider the behavior of
the system when a maximum size lower priority non-audio
packet 120 (120 µsec) is interposed into the audio stream
and results in delay of the audio packets.

[0038] Assume the number of audio channels almost
fills up the entire capacity of the network bandwidth.
Audio data is sent in packets holding a constant, chosen
number of samples (chosen above to be 12); the time these
samples span is the audio frame time 100 on the horizontal
axis of Figure 3. The time left over in each frame is
called the unused network bandwidth time 110.

[0039] Consider the situation shown in Figure 3. At the
very moment 122 a non-audio packet 120 is starting to
be sent, a large group of audio packets 124 arrives at the
switch 22. All the audio packets 124 must wait for the
non-audio packet before they begin to be sent. Notice
that the next following group 2 of audio packets 126
begins to arrive at the switch before the previous group
(the group delayed by the non-audio packet) has been
sent. This next group of packets 126 simply queues up
at the higher audio priority behind the previous audio
packets in the process of being sent. Note that at the
completion of the first group of audio packets there is
no opportunity for a non-audio lower priority packet to
be sent before the second group of audio packets, since
at that moment the higher priority of the already present
audio packets 126 precludes any lower priority
transmissions. Succeeding groups of audio packets continue
to arrive before the previous audio packets have been
sent, each group being sent with less and less delay,
by the incremental amount of the unused network
bandwidth time 110. Eventually, after enough audio group
times 'N', the switch 22 is 'caught up' with the pending
audio packet transmissions, and a gap 130 appears in the
audio packet transmissions. This then allows
the next lower priority non-audio packet waiting in the
low priority queue of the switch to be sent to the outgoing
network link, and the above process repeats. The value
of N is the quotient of the transmission time of the largest
packet divided by the unused network bandwidth time.
[0040] An important fact to observe is that at no time
is an audio packet delayed in transmission to a receiver
C by more than the transmission time of one maximum
sized non-audio packet, even when more non-audio
packets are waiting to be sent and the audio packets
consume most of the available network bandwidth.
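
The catch-up behavior of Figure 3 can be reproduced with a
small simulation. This is a simplified sketch under stated
assumptions: every audio group is taken to arrive in full
at the start of its frame, only one non-audio packet is
considered, and the 30 µs of unused time per frame is an
invented example value (the 250 µs frame and 120 µs packet
come from the text).

```python
def group_delays(frame_us, unused_us, blocker_us, groups):
    """Delay before each audio group starts to be sent on the output link.

    A non-audio packet lasting blocker_us begins transmission at time 0,
    exactly when the first audio group arrives.
    """
    audio_us = frame_us - unused_us   # link time one audio group occupies
    link_free_at = blocker_us         # link is busy with the non-audio packet
    delays = []
    for k in range(groups):
        arrival = k * frame_us
        start = max(arrival, link_free_at)
        delays.append(start - arrival)
        link_free_at = start + audio_us
    return delays


delays = group_delays(frame_us=250, unused_us=30, blocker_us=120, groups=6)
catch_up_groups = 120 // 30  # N = largest packet time / unused bandwidth time
```

The delays come out as 120, 90, 60, 30, 0, 0 µs: each
group recovers one unused bandwidth time, the switch has
caught up after N = 4 groups, and no group waits longer
than the 120 µs of the non-audio packet.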



Network Timing 

[0041] At a network node 20 where analog audio
signals originate, the node 20 receives as input an analog
audio input 140 (see Figure 9). Digital audio is sampled
from the original analog with a converter 142 that
measures the amplitude at regular intervals and passes this
value (as a digital signal) on to the subsequent network
node, such as a node with a speaker as an output device
coupled to an audio output 144. When the digital signal
needs to be turned back into analog, there is a reverse
process performed by a converter that makes analog
signals from the input numerical values.
[0042] To reduce delay and ensure reliable audio, a
common sampling clock must be used system-wide by
nodes 20 on the network shown in Figure 1. If each
converter had an independent clock, the slight differences
in the rate would mean that a buffer would be needed
at the receiver, and even so, after some time the buffer
would eventually overflow or underflow and the audio would
be interrupted.

[0043] In accordance with the invention, one terminal
or node is designated to be the master clock source and
implements a master clock 150 to which all the other
nodes 20 are locked. (If the master clock is unplugged
or fails, another node automatically takes its place in a
seamless fashion.) A clock packet that contains a time
value 152 is periodically sent by the source node, but
unlike in the prior art patents referenced above, this
packet is not used to create time slots or to order the
outputs of the transmitting terminals. Such control is not
needed, because the invention uses switched Ethernet
rather than a shared medium and has no need for timed
access. The clock packet is not transmitted at the
beginning of a sequence of audio packets. Rather, it is
transmitted at a much lower rate, and a PLL (Phase
Locked Loop) circuit at each of the nodes increases the
rate to provide a synchronized audio sample clock in
receiving terminals or nodes.

Recovering digital audio synchronization 

[0044] The ability to recover digital audio
synchronization at multiple stations or nodes on the
network relies on specialized statistical filtering of
received timestamped clock information packets. Because
packet switched networks can introduce a variable routing
delay, a variable time delay is introduced into the
communication of timing information, which would cause a
variable timing synchronization error in all receivers.
However, because the packet switched network can only add
delay, it can never deliver a packet 'early'. This error is
biased, and therefore can be mathematically filtered out.
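
Because the network can only add delay, the smallest
observed difference between local arrival time and master
timestamp is the observation least corrupted by queuing.
The sketch below illustrates that idea with a plain
minimum over a window of clock packets. It is an
assumption-laden simplification, not the specialized
statistical filter of the application, and the function
name and sample values are invented.

```python
def estimate_offset(observations):
    """Estimate the clock offset (plus minimum transit delay) in timestamp ticks.

    observations: iterable of (master_timestamp, local_arrival_time) pairs.
    Each difference equals the true offset plus a non-negative network
    delay, so the minimum difference is the least-delayed observation.
    """
    return min(local - master for master, local in observations)


# Invented example: true offset of 1000 ticks, with variable queuing delay.
true_offset = 1000
transit_delays = [37, 5, 120, 5, 64]
observations = [
    (master, master + true_offset + delay)
    for master, delay in zip(range(0, 5000, 1000), transit_delays)
]
estimate = estimate_offset(observations)
```

The estimate is 1005 ticks: the true offset plus the fixed
minimum transit delay of 5 ticks, with the variable part
of the delay removed.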
[0045] Any devices communicating digitized audio
information must operate off of an identical time base, or
the digital audio information exchanged will not be able
to be output, mixed, or otherwise combined with other
audio channels. (A straightforward solution of using
sample rate conversion for each audio data stream has
the undesirable penalty of creating audio delay due to
the buffering required by the mathematical conversion
filtering process.) Therefore, a desirable solution is to
have a clock circuit in each device or receiver station,
all of which are synchronized to a common time
reference. However, in order to synchronize clock
devices, information must be communicated between
them, allowing them to be adjusted to be synchronized.
This synchronization information is itself sensitive to
timing errors; that is to say, time delays in the
communication of synchronization information will prevent
proper time synchronization.

[0046] Packet switched networks have the property of 
delivery of packets of data with a variable time delay, 
dependent on the amount of network traffic. Since the 
network 10 transmits a mix of many types of packets 
from many senders to receivers at many different times, 
the switching delays experienced by clock packets are 
generally not precisely predictable, and have a variable, 
chaotic, and even a certain amount of random behavior. 
[0047] This switching delay, with its magnitude and its
variations, normally prevents effective communication for use in
synchronizing clocks, and is the fundamental problem to be solved
in order to achieve node synchronization.

[0048] Referring to Figure 4, in order to synchronize multiple
clock devices, one device is chosen to be the master and
implements a master clock 150, while all other devices become
slaves which must follow and synchronize to the one master by
implementing a slave clock 154. Choosing which device will be the
master may be a manual operation, or an automatic one determined
by a predetermined protocol exchanged via the communication
network 10 in the event of a failure of the master. In one
exemplary process, after a timeout delay of receiving no clocks,
the master clock 150 is assumed to be no longer functioning, and
every possible new master transmits a preliminary clock message.
If there is more than one new clock master candidate, the
candidates vote themselves off in favor of the master detected
with highest merit. In this embodiment the master with highest
merit is determined from an assignment of unique values to each
device, for example, the lowest Ethernet network address value.
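
The election rule described above (highest merit goes to the lowest Ethernet address) can be sketched as follows. This is a minimal illustration only: the function names, the string address format, and the idea that each candidate keeps a list of the candidate addresses it has heard are assumptions, not details given in the text.

```python
def mac_to_int(mac: str) -> int:
    """Convert an address such as '00:1a:2b:00:00:02' to an integer."""
    return int(mac.replace(":", "").replace("-", ""), 16)


def elect_master(candidates: list[str]) -> str:
    """Return the candidate with highest merit, i.e. the lowest address."""
    if not candidates:
        raise ValueError("no master candidates announced themselves")
    return min(candidates, key=mac_to_int)


def should_stand_down(own_mac: str, heard_macs: list[str]) -> bool:
    """A candidate votes itself off if it hears a lower (better) address."""
    own = mac_to_int(own_mac)
    return any(mac_to_int(m) < own for m in heard_macs)
```

Because every device applies the same rule to the same unique values, all candidates reach the same result without a separate coordinator.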
[0049] The master marks and communicates time reference moments
to all slaves by a broadcast or multicast method of addressing
all slaves with one packet. This packet contains a time reference
count, called a timestamp value 152. This timestamp value 152 is
a measure of time made by the master clock device in arbitrary
time units. It is important that the value 152 be of high enough
resolution to allow very small time differences or errors to be
calculated by the slaves. In the exemplary implementation, the
timestamp is in units of 1/12,288,000 second, i.e. one period of
a 12.288 MHz clock (approximately 80 ns).
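
As a quick arithmetic check of the timestamp resolution, the sketch below converts ticks of a 12,288,000 Hz timebase to nanoseconds and counts the ticks in one audio sample period. The 48 kHz sample rate is an assumption made for illustration (it is consistent with 12 samples filling a 250 µs audio frame), and the function names are hypothetical.

```python
TICK_RATE_HZ = 12_288_000  # timestamp timebase from the exemplary implementation


def ticks_to_ns(ticks: int) -> float:
    """Duration of a number of timestamp ticks, in nanoseconds."""
    return ticks * 1e9 / TICK_RATE_HZ


def ticks_per_sample(sample_rate_hz: int = 48_000) -> float:
    """Timestamp ticks per audio sample (sample rate is an assumption)."""
    return TICK_RATE_HZ / sample_rate_hz
```

One tick is about 81.4 ns, and at the assumed 48 kHz rate each audio sample spans 256 ticks, so the slaves can resolve timing errors far smaller than one sample period.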
[0050] Once the measure of the local clock time is
made by the master clock 150, the resulting data packet




(called a clock packet) is sent to the packet network 10 for
communication to all the slaves. Each slave, when it receives a
clock packet, measures its own local clock device 154, for
comparison to the master clock reference value 152 communicated
inside the packet. In order to synchronize the slave clock 154 to
the master clock, successive comparisons between the master and
slave clock values are made at the slave node. If the comparison
value is getting larger over time, then the slave clock 154 is
running too fast, and a rate control adjustment is made to slow
the slave clock down; conversely, if the slave clock is found to
be running too slow, a rate adjustment is made to speed it up.
The specific formulas used to calculate the amount of rate
adjustment, given the observed comparison differences over time,
may be any of many standard control algorithms, including a
standard second order PLL (Phase Lock Loop) or PID (Proportional
Integral Derivative) control algorithms implemented in software.
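
A minimal sketch of such a rate adjustment, using a proportional-integral loop, is shown below. It is one member of the family of standard control algorithms mentioned above, not necessarily the one used in the exemplary implementation; the class name and the gain values `kp` and `ki` are placeholders chosen for illustration.

```python
class SlaveClockLoop:
    """Proportional-integral rate controller for a slave clock (sketch)."""

    def __init__(self, kp: float = 0.1, ki: float = 0.01):
        self.kp = kp          # proportional gain (placeholder value)
        self.ki = ki          # integral gain (placeholder value)
        self.integral = 0.0   # accumulated comparison error

    def update(self, slave_ticks: int, master_ticks: int) -> float:
        """Return a rate correction from one slave/master comparison.

        A positive comparison (slave ahead of master) yields a negative
        correction, i.e. slow the slave clock down, and vice versa.
        """
        error = slave_ticks - master_ticks
        self.integral += error
        return -(self.kp * error + self.ki * self.integral)
```

The integral term is what lets the loop remove a steady rate difference rather than only reacting to the most recent comparison.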

[0051] If there were no variation in the delivery time of the
clock packet via the packet network 10, then implementation of
this method alone would result in perfect synchronization between
the slave clock 154 and the master clock 150, apart from the
constant network transit delay, which could be measured and
subtracted out. However, the variation of the clock packet
delivery introduces an error in the measurements.



Delivery time variation

[0052] In order to overcome the effects of the unknown packet
delivery time variation 'X', some observations can be made about
the value of X: X is always greater than or equal to 0; X can
never be negative. This means that X represents a biased error in
the time communication, and therefore this bias may be filtered
out using a mathematical filter.

[0053] Another way of viewing this packet delivery time variation
filter is to observe that the time variation comes from
additional delay in the packet network, which is a result of
other packets traversing the packet network at the same time as
the clock packet. Specifically, the time variation for the clock
packets is the sum of the queuing delays in the switches 22
resulting from all other packet traffic. We may assign the clock
packets the highest priority (see the discussion above regarding
priority assignment), but there will still be the queuing delay
caused by the sending of lower priority packets in progress when
the higher priority clock packet arrives in the switch. A key
observation is that if, at the moment the clock packet arrives in
the switch, there are no other packets in progress of being sent,
then the clock packet will be sent out immediately with minimum
delay. Therefore, at the receivers, over time some of the clock
packets will have arrived delayed by other packet traffic, and
some will arrive not having been delayed by other packet traffic.
By determining which clock packets had been delayed, and which
had not, the time measurements of the packets that had been
delayed can simply be ignored, and local clock rate adjustment
calculations made based solely on the non-delayed clock packets
(that is, the clock packets for which the time variation 'X'
introduced above is zero or minimum).

Determining which clock packets to use 

[0054] To determine which clock packets have been subjected to
queuing delay as they traversed the packet network, and which had
not, the invention collects a set, or ensemble, of clock packets
in each receiver.
[0055] The size of this set that must be collected is determined
by the statistics of the traffic on the packet network in use.
The set must be large enough that, given the variations of
delivery time, the probability of at least some of the clock
packets having been received without extra delay is very close
to 1. There are at least two methods for calculating an estimate
of this probability.

[0056] Referring to Figure 5A, a first method of estimating the
probability is based on a determination of the ratio of network
free time B as a percentage of all time B+A. Given the expected
network traffic density, this method chooses the time interval
for collecting the clock packets such that the probability of
having network free time is greater than zero.

[0057] For example, if the sum total traffic on the network is at
50% capacity, then roughly half the time a packet will be in
transit coincident with other packets and may see a delay, but
the remaining portion of the time it may not. Packet traffic
tends to be bursty, with time periods of high volume followed by
periods of low volume. In these cases the typical time intervals
of the bursts are more important than the measure of average
network capacity used.

[0058] A second probability estimate, illustrated in Figure 5B,
is derived from a property of the behavior of the packet Ethernet
switch that determines the probability of high priority clock
packets propagating through the links of the network with minimum
delay: the ratio of the desired definition of 'minimum' delay to
the transmission time of a maximum size packet. From the moment a
high priority clock packet arrives in the switch, it will be next
to be sent out by virtue of its high priority, but must wait for
any packet currently being transmitted to complete. If we define
a 'minimum' delay to be, say, 1 µsec, and the maximum packet is
120 µsec long (for 100BASE-T Ethernet), then the probability that
a clock packet will arrive less than 1 µsec before the previous
packet transmission is complete is 1/120. Therefore, even with
network capacity at 100%, if we collect 120 clock packets, we can
expect on average one of the clock packets to have experienced a
delay of less than 1 µsec in the switch.
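
The two estimates can be turned into numbers under a simplifying assumption that the text does not make: each clock packet independently meets the 'minimum delay' condition with the same probability. Under that assumption the sketch below computes the chance that at least one packet of a set qualifies, and the set size needed for a target probability. The function names are hypothetical.

```python
import math


def prob_at_least_one(p_single: float, n_packets: int) -> float:
    """P(at least one of n independent packets arrives with minimum delay)."""
    if not 0.0 < p_single < 1.0:
        raise ValueError("p_single must lie strictly between 0 and 1")
    return 1.0 - (1.0 - p_single) ** n_packets


def packets_needed(p_single: float, target: float) -> int:
    """Smallest set size whose success probability reaches the target."""
    if not 0.0 < p_single < 1.0 or not 0.0 < target < 1.0:
        raise ValueError("probabilities must lie strictly between 0 and 1")
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - p_single))
```

For the fully loaded worst case with a per-packet probability of 1/120, a set of 120 packets succeeds with a probability of roughly 63%, and about 551 packets are needed to reach 99%. When the network has free time the picture is far better: with 20% free time, a set of 50 packets already succeeds with a probability above 99.99%.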
[0059] For a packet switched network carrying digital 
audio traffic streams of some amount, say 80% capacity, 




plus command and control information for those digital audio
devices, an exemplary system has a very high probability of some
clock packets arriving with minimum variable delay by collecting
between 50 and 250 clock packets over an interval of 200
milliseconds to 1 second.

Histogram filtering 

[0060] Referring to Figure 6, once the set of clock packets is
collected, observe a histogram 156 of the comparisons made
between the master clock and the local slave clock. The set of
time comparisons will be spread from a minimum to a maximum
value. Since both the master clock 150 and the slave clocks 154
are stable relative to real time (they only differ in rate), the
variation may be attributed solely to the variable network delay.
Therefore, the time measurement values at the minimum end of the
histogram belong to the clock packets that experienced the
minimum extra network delay. All other packets may be ignored,
and the values from the minimum end of the histogram are used to
perform the slave clock rate adjustment calculation discussed
previously.
[0061] In practice, since it is known ahead of time that the
process uses only the minimum range value at the end of the
histogram process, it does not need to store the data for the
entire histogram. Rather, it simply finds the minimum time
difference value of the set of clock packets as they arrive.
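
The running-minimum shortcut can be sketched as below. The class and method names are hypothetical, the set size is supplied by the caller, and counter wraparound of the timestamps is deliberately ignored to keep the sketch short.

```python
class MinimumOffsetFilter:
    """Keeps only the smallest slave-minus-master offset of each set."""

    def __init__(self, set_size: int):
        if set_size < 1:
            raise ValueError("set_size must be at least 1")
        self.set_size = set_size
        self._count = 0
        self._minimum = None

    def add(self, local_ticks: int, master_ticks: int):
        """Feed one comparison.

        Returns the minimum offset once the set is complete, otherwise
        None. Queuing delay can only make the offset larger, so the
        minimum belongs to the least-delayed packet of the set.
        """
        offset = local_ticks - master_ticks
        if self._minimum is None or offset < self._minimum:
            self._minimum = offset
        self._count += 1
        if self._count < self.set_size:
            return None
        result = self._minimum
        self._count = 0        # start collecting the next set
        self._minimum = None
        return result
```

Each completed set yields a single value, which is what would be handed to the rate adjustment calculation; storage is constant regardless of the set size.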
[0062] The exemplary embodiment of the invention uses a novel
design for transmitting timestamped clock references on packet
switched networks, allowing optimal clock synchronization
recovery that is particularly advantageous for use with audio
data transmission. The disclosed exemplary embodiment of the
invention uses a process for sending timestamped clock
references which optimizes clock recovery when using a
statistical filtering synchronization scheme in each receiver.
[0063] In order for clock synchronization using statistical
filtering of clock packets to operate correctly, the probability
of at least some clock packets arriving with minimum delay (i.e.
no extra switch queuing delay) must be close to 1. This
probability is an interaction of the characteristics of the
network traffic and the characteristics of when and how the
clock packets are sent. The characteristics of the network
traffic are regarded as outside the control of the system (in
order not to place constraints on the system). The transmission
of clock packets is therefore designed to maximize this
probability.

Design Requirements of the Transmission Pattern of 
Clock Packets 

[0064] Given that the delay that a packet switched
network adds to any given packet is a function of the
other traffic on the network, the delay statistics of the
network are really the statistics of all the other traffic on






the network. Without attempting any overall media access control,
or prescribing any overall restrictions or traffic grooming on
the overall traffic on the network, it must be assumed that the
overall traffic pattern is arbitrary and random. Because the
overall traffic patterns are arbitrary, there may indeed be
traffic patterns that have pronounced repetitive periodic
patterns, bursts, or long streams of bursts. It cannot be assumed
that overall traffic is statistically 'random' in the sense of
lacking structure; it may have pronounced (but arbitrary)
structure.
[0065] For correct operation, the pattern of transmission of
timestamped clock reference packets is chosen so that at least
some of the time the clock reference packets traverse the entire
network to the intended receivers with minimum delay. Note that
any given clock packet broadcast onto the network by the master
does not have to reach all receivers with minimum delay; it is
sufficient that at least some of the time some clock packets
reach each receiver with minimum delay.

Network Traffic Patterns 

[0066] Referring to Figure 7, the network traffic is undefined,
but is not completely random. Two dominant traffic patterns
commonly appear that are characteristic of a wide class of data
flows:

Bursty Traffic Pattern 


[0067] Bursty traffic occurs when a relatively large amount of
data needs to be transferred, but only once. When the data
transfer is demanded, it may take many packets of network
transfer to complete the required data transfer, and these
should all complete with as little delay as possible. Therefore,
a group of transfers happens together (a burst) until the
overall data request is complete, and then the network transfers
stop. Network protocols like TCP/IP have mechanisms to spread
out these bursts somewhat, to promote sharing of the network
even during large bursts. The characteristics of bursty network
traffic are the statistics of the burst length (Bt) and the time
gaps between the bursts, called the burst gap (Gt).

[0068] In order to have a reasonable probability of at least
some clock packets of a set traversing the network with minimum
delay, the length of time covered by the set of clock packets,
C(set)t, should be greater than the maximum expected burst
length time Bt. Otherwise, all the clock packets of a set may be
delayed by the existing network burst. In practice, if the
priority of the clock packets is set higher than that of the
bursty network traffic, then this constraint on the design of
the clock packet set size may be relaxed.


Isochronous Traffic pattern 

[0069] Referring to Figure 8, isochronous network traffic occurs
when a certain amount of data is transferred periodically by the
network. The 'iso-' name comes from the fact that these data
transfers are not in exact synchronization with time, since the
variable delay of packet network delivery prevents this. They
are approximately periodic in time, having a period P, and may
continue to exist for extended or indefinite periods of time
(that is, they may never stop). Many multimedia streams carried
on networks form isochronous traffic patterns.
[0070] Note that when a bandwidth sharing algorithm, such as
that of TCP/IP, controls a large burst transfer in order to
throttle back and use less network bandwidth, it may, for a
certain duration, create a stream of packets spaced out at some
pseudo-interval. This is not true isochronous traffic, but it
has the same potential for colliding with and disrupting clock
packets.
[0071] The isochronous traffic pattern has the greatest
potential to disrupt the communication of synchronization
information over the packet network. This is because the
potential exists that any isochronous stream may happen to have
the same or a similar interval as that chosen for a given set of
clock packets communicated for the purpose of clock
synchronization. In this case, even if the clock packets are set
to a higher priority than the isochronous stream, each and every
clock packet may still experience queuing delay in a switch 22,
since at each and every moment a clock packet arrives at the
switch, a packet from some isochronous stream may have just
started transmission. This 'accidental correlation' between the
isochronous streams and the clock packet sequence period Gp is
avoided by practice of the present invention.

Clock Packet Transmission Pattern Solution: 

[0072] Any regular, periodic pattern of transmission of 
timestamped clock references is disqualified since it 
may run into conflict with one of the arbitrary overall ex- 
isting isochronous traffic patterns. 
[0073] A pattern of random intervals 170, or a sufficiently
pseudo-random interval pattern, is chosen for the transmission
of the timestamped clock reference packets. Statistically, this
ensures that, on the whole, at least some of the clock reference
packets will reach each receiver free of conflict with other
traffic, and fulfill the requirements for allowing clock
synchronization recovery. These random intervals are determined
in software or hardware by the node designated as the master,
and at the beginning of each such interval the node broadcasts a
timestamped clock packet onto the network 10.
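
A minimal sketch of generating such a send schedule in software follows. The uniform distribution, the gap bounds, and the function name are assumptions made for illustration; the text only requires that the intervals be random or sufficiently pseudo-random.

```python
import random


def clock_packet_send_times(n_packets, min_gap_s, max_gap_s, seed=None):
    """Return n send times (seconds) separated by pseudo-random gaps.

    Each gap is drawn uniformly from [min_gap_s, max_gap_s], so the
    sequence has no fixed period that an isochronous stream could
    lock onto.
    """
    if not 0 < min_gap_s < max_gap_s:
        raise ValueError("require 0 < min_gap_s < max_gap_s")
    rng = random.Random(seed)
    t = 0.0
    times = []
    for _ in range(n_packets):
        t += rng.uniform(min_gap_s, max_gap_s)
        times.append(t)
    return times
```

For example, gaps between 2 ms and 6 ms average 4 ms, which yields about 250 clock packets per second. At the start of each interval the master would take a timestamp and broadcast it.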

Node functional block diagram 

[0074] There are two signal flow paths represented in 
a node 20 depicted in Fig 9, receive and transmit. The 
receive path flows from top to bottom, and the transmit 
is bottom to top. 




Receive Path: 

Ethernet PHY 

[0075] The receive path begins with packets coming in from the
Ethernet network via the Ethernet physical interface 210. This
device transcodes the particular electrical, wireless, or
optical signal format used for transmission between nodes into
standard digital logic signals. The Ethernet physical interface
210 presents the data of the incoming packets to a packet
receiving circuit.

Packet Filter 

[0076] A packet filter 212 tests the data in each received
packet to see whether it belongs to one of the audio streams or
contains clock sync information. If it is neither an audio nor a
clock packet, the packet either represents non-audio data for
that node or is addressed to another node. If the packet
contains non-audio data, a node processor interprets that data
in a conventional manner. The packet filter does this by
comparing the destination address contained inside the data
packet with a list of destination addresses that the receiving
terminal is programmed to accept. The list of accepted
destination address numbers is programmed by a node processor
213 into the packet filter ahead of time, depending on which
audio channels from the network the user desires to come out of
the outputs of this audio receive terminal. If the packet
address does not match any of the accepted destination addresses
on the list, no further action is taken on that packet and it is
simply ignored. If the packet address does match an accepted
address on the list, which address it matches determines the
next step of processing the incoming packet.
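
The address-based dispatch described above can be sketched in software as follows. This is an illustration only: the class, its attribute names, the string addresses, and the returned labels are hypothetical, and the text itself describes a filter programmed by the node processor rather than this particular structure.

```python
from collections import deque


class PacketFilter:
    """Routes received packets by destination address (sketch)."""

    def __init__(self, clock_address, audio_addresses, own_address):
        self.clock_address = clock_address
        self.own_address = own_address
        # Position on the accepted list selects the audio channel buffer.
        self.channel_of = {a: i for i, a in enumerate(audio_addresses)}
        self.audio_buffers = [deque() for _ in audio_addresses]
        self.clock_events = []   # (local clock value, packet contents)
        self.other_data = []     # non-audio data for the node processor

    def receive(self, dest_address, payload, local_clock_ticks):
        if dest_address == self.clock_address:
            # Store the local clock reading alongside the packet contents.
            self.clock_events.append((local_clock_ticks, payload))
            return "clock"
        channel = self.channel_of.get(dest_address)
        if channel is not None:
            self.audio_buffers[channel].extend(payload)  # FIFO order
            return "audio"
        if dest_address == self.own_address:
            self.other_data.append(payload)
            return "data"
        return "ignored"
```

A dictionary lookup stands in for the comparison against the accepted list; an unmatched address falls through and the packet is dropped.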

Clock packets: 

[0077] If the destination address matches the address for clock
packets, then a time measurement of the local clock 214 is
triggered, and the local time clock value is stored along with
the received clock packet contents. This storage event notifies
the software running on the node processor that a new clock
packet has arrived. Software on the node processor reads the
clock packet information and compares the local clock to the
remote master clock by performing a histogram statistical clock
filtering algorithm. The clock filtering algorithm may result in
a decision to adjust the local clock 214 to make it either
faster or slower using a software implemented phase lock loop
218.

Audio Packets 

[0078] If the packet destination address matches one 
of the audio channel addresses on the list, then that 
packet is routed and stored into a corresponding audio 




channel buffer 220. That is, if the audio packet address matches
the first audio channel address on the list, then the audio data
is put into the first audio channel buffer; audio data matching
the second address on the list goes into the second audio
channel buffer, and so forth. The audio channel buffers 220 are
maintained in FIFO order, and read out at a periodic rate
determined by the local sample clock, serialized, and sent to
the Digital to Analog (D/A) converter 222 to be converted into
an analog audio signal output 144 (or sent to an AES/EBU
transmitter to become a standard digital audio signal).

Effects of Clock Synchronization 

[0079] Note that if the local sample clock is running faster
than the remote master clock, the audio channel buffer will be
emptied by the D/A converter 222 faster than it is filled from
network audio packets, which results in underflow and an
interruption of the audio. Likewise, if the local sample clock
is running slower than the remote master clock, the audio
channel buffer will become full, resulting in overflow and
likewise a loss of audio data. Both of these conditions are
avoided by the proper synchronization of the local clock 214 to
the remote master clock 150 so that the net empty and fill rates
of the buffers are the same.

Receive Channel Buffer Initialization 

[0080] Also note that the receive audio channel buffers 220 must
be properly initialized so that they contain the chosen average
amount of audio data corresponding to the buffer size outlined
previously. The maximum capacity of the FIFO is not the buffer
size we desire (for the example of 24 audio samples outlined).
What is required is 24 audio samples contained in the FIFO at
the moment of the beginning of an audio frame period. (In Fig. 9,
the 'N input channel buffers' should show nominally 24 samples
each, while the 'N output channel buffers' remain nominally 12
samples each.) The maximum capacity of the FIFO memory may be
any number larger than the required buffer size, and is not an
important parameter of this design.
[0081] One of at least two methods may be used to initialize the
receive FIFO audio buffers 220. The first method is to empty the
buffer while disabling the output. Then, after 24 samples (2
nominal audio frames) have come in from the network, enable the
output. The second method is to directly manipulate the internal
FIFO memory storage pointers. At the moment the FIFO begins to
be filled, set the output pointer equal to the input pointer
minus 24 audio samples (or alternatively at this moment set the
input pointer to the output pointer plus 24 audio samples). Both
of these methods will initialize the received audio channel
buffer FIFO to have nominally the chosen buffer occupancy size.
The receive channel buffer is implemented in certain nodes using
a field programmable gate array (FPGA) commercially






available from Xilinx. It includes memory for the buffers and
programmable logic for maintaining those buffers. Other nodes,
such as PC based nodes, implement these buffers completely in
software that interfaces with a standard network interface card.
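
For a software node, the two initialization methods can be sketched with a simple ring buffer. The class name, the capacity of 256, and the choice to output silence (zero) while the output is disabled are assumptions for illustration; overflow handling is omitted.

```python
class ReceiveFifo:
    """Ring buffer with a nominal fill of target_fill samples (sketch)."""

    def __init__(self, capacity: int = 256, target_fill: int = 24):
        if capacity <= target_fill:
            raise ValueError("capacity must exceed the target fill")
        self.capacity = capacity
        self.target_fill = target_fill
        self.mem = [0] * capacity   # zero stands for silence
        self.write_ptr = 0          # input pointer
        self.read_ptr = 0           # output pointer
        self.output_enabled = False

    def fill_level(self) -> int:
        return (self.write_ptr - self.read_ptr) % self.capacity

    def write(self, sample: int) -> None:
        self.mem[self.write_ptr] = sample
        self.write_ptr = (self.write_ptr + 1) % self.capacity
        # Method 1: keep the output disabled until the target fill is reached.
        if not self.output_enabled and self.fill_level() >= self.target_fill:
            self.output_enabled = True

    def read(self) -> int:
        if not self.output_enabled or self.fill_level() == 0:
            return 0                # silence while disabled or empty
        sample = self.mem[self.read_ptr]
        self.read_ptr = (self.read_ptr + 1) % self.capacity
        return sample

    def preset_pointers(self) -> None:
        """Method 2: place the output pointer target_fill behind the input."""
        self.read_ptr = (self.write_ptr - self.target_fill) % self.capacity
        self.output_enabled = True
```

With the second method the output first plays whatever already sits in the 24 preset slots (zeros here) before reaching newly received samples, which gives the same nominal occupancy as the first method.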

Transmit Path: 



[0082] Transmit data originates from the Analog to Digital (A/D)
converters 142 transcoding analog audio into digital numerical
values (or digital numerical values may be received directly
from AES/EBU digital audio receivers). This data is received
serially, converted to parallel by a converter 224 and stored
into an appropriate transmit audio channel buffer 230. The
transmit audio channel buffers collect enough audio samples to
form a complete audio packet. (In the exemplary embodiment this
is the data for 12 audio samples.) When there is enough data in
the buffer for an audio packet, the packet transmit is
triggered. The packet generator takes the audio data out of the
channel buffer and builds an audio packet, adds the packet
header information, computes and adds a CRC check value to the
end, and sends the packet to the Ethernet physical interface
210. When the audio packet is created, the audio data from
channel buffer 1 is given the packet destination address for the
first output audio channel, buffer 2 is given the address for
channel 2, and so forth. The destination addresses are
determined by the node processor software ahead of time and
programmed into the packet generator, as the user configures how
the audio channels are to be routed.
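
The transmit steps (collect 12 samples, add a header, append a CRC) can be sketched as below. The packet layout is hypothetical: the header here is just a 6-byte destination address, samples are packed as 32-bit big-endian integers, and the CRC-32 from Python's `zlib` module stands in for the check value. None of these layout details come from the text.

```python
import struct
import zlib

SAMPLES_PER_PACKET = 12  # from the exemplary embodiment


def build_audio_packet(dest_address: bytes, samples: list) -> bytes:
    """Build header + payload + CRC for one audio packet (layout assumed)."""
    if len(samples) != SAMPLES_PER_PACKET:
        raise ValueError("an audio packet carries exactly 12 samples")
    payload = struct.pack(f">{SAMPLES_PER_PACKET}i", *samples)
    body = dest_address + payload
    return body + struct.pack(">I", zlib.crc32(body))


class TransmitChannel:
    """Collects samples for one channel and emits a packet when full."""

    def __init__(self, dest_address: bytes):
        self.dest_address = dest_address
        self.buffer = []

    def push_sample(self, sample: int):
        """Add one sample; return a packet once 12 are buffered, else None."""
        self.buffer.append(sample)
        if len(self.buffer) < SAMPLES_PER_PACKET:
            return None
        packet = build_audio_packet(self.dest_address, self.buffer)
        self.buffer = []
        return packet
```

Because the same local sample clock both produces the samples and decides when a packet is full, the channel needs no separate timer: the twelfth sample is the trigger.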

[0083] The Ethernet physical interface 210 transcodes the packet
data into signaling to the network connection (wires, wireless,
or fiber optic).
[0084] Note that since the timebase for generating the audio
data from the A/D converter 142 and the timing for determining
when it is time to send the audio packets to the network are
both determined from the local sample clock 214, the buffer
synchronization of the transmit mechanism is much simpler than
the mechanism for initializing the buffers for receive. It is
sufficient to simply wait for the transmit buffers to be full
enough, and then transmit audio packets.
[0085] While the invention has been described with a degree of
particularity, it is the intent that the invention include all
modifications and alterations falling within the spirit or scope
of the appended claims.



Claims 






1. A process of delivering audio signals from a source
node to a destination node on a network comprising
the steps of:

providing a number of switches that transmit
prioritized data on a data network; and

coupling the switches to a number of nodes for
sending and receiving digital audio signals on
the data network; at least some of said nodes
having a receive buffer sized to hold an amount
of audio data samples which, times the sampling
period, approximately equals the network
transmission time of one maximally sized network
packet, per the number of intervening switches,
in order to minimize delays in processing audio
signals arriving at said node.

2. The process of claim 1 additionally comprising the
step of assigning a priority to data packets at a
source node based on whether the packet is an audio
or non-audio packet, audio packets being assigned
higher priority, for the purpose of causing
switches interposed between nodes to transmit
packets through said switch that are received from
a source node based on the priority of said data
packets.

3. Apparatus for delivering audio signals from a
source node to a destination node on a network 
comprising: 

a number of switches that transmit prioritized 
data on a data network; and 
a number of send/receive nodes for sending 
and receiving digital audio signals on the data 
network; at least some of said nodes having a 
receive buffer sized to hold an amount of audio 
data so that the amount of time represented by 
the audio sample data approximately equals 
the network transmission time of one maximally 
sized network packet, per the number of inter- 
vening switches, in order to minimize delays in 
processing audio signals arriving at said re- 
ceive node. 

4. The apparatus of claim 3 additionally comprising a
packet generator in a source node for assigning a
priority to data packets at said source node based
on whether the packet is an audio or non-audio
packet, audio packets being assigned higher
priority, for the purpose of directing the switches
to transmit packets through said switch that are
received from a source node based on the priority
of said data packet.

5. A process of synchronizing events on a network by:

maintaining a master clock at a specified node 
of a network of interconnected nodes; 
at intervals encoding a timing packet for trans- 
mission to one or more other nodes on the net- 
work; and 

determining the timing packets received with 
the least timing error by finding the minimum 






network transit time between the master node
and the one or more other nodes, by finding the
minimum time offset from a set, or group, of
multiple timing packets; and
then using only the timing packets received with
least timing error to synchronize the local clock
maintained at the one or more other nodes to
the clock at said master node.

6. The process of claim 5 wherein the step of
synchronizing is performed at each node by a digital phase
lock loop that takes the time comparison information
from the timing packets received with least timing
error introduced by the packet network transit
delay time, and computes the rate control adjustment
to bring the local clock at the node into
synchronization with the master clock.

7. The process of claim 5 where the specified master
node sends out a multiplicity of timing packets onto
the network at irregular or pseudo-random intervals
to increase the probability that at least some of
the timing packets arrive at the one or more other
nodes with a minimum network transit delay time.

8. Apparatus for synchronizing events on a network
comprising:

a specified node of a network of interconnected
nodes including a digital circuit for implementing
a master clock that encodes, at intervals, a
timing packet for transmission onto the network;
and

one or more other nodes on the network having
a processor for determining the timing packets
received with a least timing error by finding the
minimum network transit time between the
master node and the one or more other nodes,
by finding the minimum time offset from a set,
or group, of multiple timing packets, then by
using only the timing packets received with least
timing error to synchronize the local clock to the
master clock.

9. The apparatus of claim 8 wherein the one or more
nodes comprises a digital phase lock loop that takes
the time comparison information from the timing
packets received with least timing error introduced
by the packet network, and computes the rate control
adjustment to bring the local clock at the node
into synchronization with the master clock.

10. The apparatus of claim 8 where the specified master
node sends out a multiplicity of timing packets
onto the network at irregular or pseudo-random
intervals to increase the probability that at least some
of the timing packets arrive at the one or more
other nodes with a minimum network transit delay
time.

11. A process of delivering audio signals from a source 
node to a destination node on a network comprising 
the steps of: 

providing one or more switches that transmit 
prioritized data on a data network; and 
coupling the switches to a number of send/re- 
ceive nodes for sending and receiving digital 
audio signals on the data network; at least 
some of said nodes having a receive buffer
maximally sized to hold audio data samples
having a period which approximately equals a
network transmission time of one maximally
sized network packet between the nodes, in order
to minimize delays in processing incoming
audio signals arriving at a node.

12. The process of claim 11 additionally comprising the
step of assigning a priority to data packets at a
source node based on whether the packet is an audio
or non-audio packet, audio packets being assigned
higher priority.

13. The process of claim 11 wherein the send/receive
nodes' receive buffer size is based on the number
of network links said maximally sized packet must
traverse in the network from a source node to a
destination node.

14. Apparatus for delivering audio signals from a 
source node to a destination node on a network 
comprising: 



a number of switches that transmit prioritized 
data on a data network; and 
a number of send/receive nodes including a 
programmable processor for sending and re- 
ceiving digital audio signals on the data net- 
work; at least some of said nodes having a re- 
ceive buffer sized to hold an amount of audio 
sample data such that the amount of time rep- 
resented by the amount of audio sample data 
held in said receive buffer approximately 
equals the network transmission time of one 
maximally sized network packet between send/ 
receive nodes to minimize delays in processing 
audio signals arriving at a receive node. 

15. The apparatus of claim 14 additionally comprising
a processor included in a source node for assigning
a priority to data packets at said source node based
on whether the packet is an audio or non-audio
packet, audio packets being assigned higher
priority.

16. The apparatus of claim 14 wherein the send/receive
nodes contain a memory for storing data in the receive
buffer that is based on a maximum number of
links between a source node and a destination node
on the network.
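
By way of illustration only, the buffer sizing recited in claims 11, 13, 14 and 16 can be sketched numerically. The sketch below is not taken from the specification: the 48 kHz sample rate, the 100 Mbit/s link rate and the 1500-byte maximally sized packet are assumed values, chosen because they reproduce the "e.g. 120us" largest-packet transmission time shown in Fig. 3. The linear scaling with the number of links is likewise an assumption about how a buffer "based on the number of network links" might be computed, and the function names are hypothetical.

```python
def packet_tx_time_us(frame_bytes: int, link_bps: int) -> float:
    """Time, in microseconds, to serialize one packet onto one link."""
    return frame_bytes * 8 * 1e6 / link_bps


def receive_buffer_samples(sample_rate_hz: int, frame_bytes: int,
                           link_bps: int, links: int = 1) -> int:
    """Smallest whole number of audio samples whose duration covers the
    transmission time of one maximally sized packet on each of `links`
    links (assumed linear model, not the patent's stated formula)."""
    numerator = sample_rate_hz * links * frame_bytes * 8
    # Integer ceiling division avoids floating-point rounding surprises.
    return -(-numerator // link_bps)


if __name__ == "__main__":
    # Assumed values: 48 kHz audio, 1500-byte packet, 100 Mbit/s links.
    print(packet_tx_time_us(1500, 100_000_000))
    print(receive_buffer_samples(48_000, 1500, 100_000_000, links=1))
    print(receive_buffer_samples(48_000, 1500, 100_000_000, links=2))
```

Under these assumed values a single link calls for a buffer of only a few samples per channel, which is the sense in which the claims tie buffer depth to the largest packet rather than to a fixed, conservative latency budget.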






[Fig. 3: timing diagrams of switch output over time. Panel labels: RECEIVE AUDIO GROUP 1; RECEIVE AUDIO GROUP 2; RECEIVE NON-AUDIO; RECEIVE AUDIO GROUP N-1; RECEIVE AUDIO GROUP N; AUDIO GROUP 1; AUDIO GROUP 2; AUDIO GROUP N; AUDIO GROUP N+1; SWITCH OUTPUT; AUDIO PACKETS; NON-AUDIO PACKET; UNUSED NETWORK BANDWIDTH TIME; AUDIO FRAME TIME (e.g. 250 us); TRANSMISSION TIME OF THE LARGEST PACKET (e.g. 120 us); TIME. Reference numerals: 100, 110, 120, 122.]
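
The Fig. 3 labels above (a non-audio packet ahead of audio packets, and the "transmission time of the largest packet") suggest the worst case for a prioritizing switch: an audio packet that arrives just after a large non-audio packet has begun transmission must wait for it to finish, but for no other non-audio traffic. The following is a minimal sketch of that behaviour, assuming a strict-priority, non-preemptive output queue; the queue discipline, the numeric priority convention and all names are illustrative assumptions rather than details taken from the specification.

```python
import heapq

AUDIO, NON_AUDIO = 0, 1  # assumed convention: lower number = higher priority


def schedule(packets):
    """Simulate one switch output port.

    packets: list of (arrival_us, priority, tx_time_us, name).
    Returns a list of (name, start_us, end_us) in transmission order.
    A packet already being transmitted is never interrupted; whenever the
    port becomes free, the highest-priority waiting packet is sent next.
    """
    pending = sorted(packets)      # ordered by arrival time
    ready = []                     # heap of (priority, arrival, seq, tx, name)
    now, i, sent = 0, 0, []
    while i < len(pending) or ready:
        # Queue everything that has arrived by the current time.
        while i < len(pending) and pending[i][0] <= now:
            arrival, prio, tx, name = pending[i]
            heapq.heappush(ready, (prio, arrival, i, tx, name))
            i += 1
        if not ready:              # port idle: jump to the next arrival
            now = pending[i][0]
            continue
        _, _, _, tx, name = heapq.heappop(ready)
        sent.append((name, now, now + tx))
        now += tx
    return sent


if __name__ == "__main__":
    traffic = [
        (0, NON_AUDIO, 120, "bulk-1"),   # largest packet, already on the wire
        (1, AUDIO, 10, "audio-1"),
        (1, NON_AUDIO, 120, "bulk-2"),
        (2, AUDIO, 10, "audio-2"),
    ]
    for name, start, end in schedule(traffic):
        print(f"{name}: {start}..{end} us")
```

In this sketch both audio packets leave ahead of the second non-audio packet, so the delay added at the port is bounded by one largest-packet transmission time plus any audio already queued.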






[Fig. 4: clock distribution diagram. Labels: MASTER CLOCK; TIME VALUE; CREATE CLOCK PACKET; DELIVERY TIME VARIATION; SLAVE CLOCK; TIME VALUE. Reference numerals: 150, 152, 154.]



[Figure (label illegible): plot of timing-packet delays. Labels: TIME DIFFERENCE BETWEEN LOCAL CLOCK AT RECEIVED MOMENT AND RECEIVED TIMESTAMP; MINIMUM DELAY VALUE, USED FOR RATE ADJUSTMENT CALCULATIONS; PACKETS THAT EXPERIENCED MODERATE DELAY; PACKETS THAT EXPERIENCED MORE DELAY; PACKETS THAT EXPERIENCED LARGE DELAY. Reference numeral: 156.]
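
The labels above describe taking, from many timing packets, the smallest difference between the local clock at the moment of reception and the received timestamp, and using that minimum for rate adjustment. A minimal sketch of such a minimum-delay filter follows. It assumes that timestamps are in microseconds, that the minimum network transit delay is the same in both observation windows, and that the rate error is estimated from the change in the minimum between two windows; the function names and the two-window estimate are illustrative assumptions, not the calculation disclosed in the specification.

```python
def min_delay_difference(samples):
    """samples: iterable of (master_timestamp_us, local_receive_time_us).

    Each difference equals clock offset plus transit delay. The smallest
    difference comes from the packet that was delayed least, so queuing
    delay experienced by the other packets is filtered out.
    """
    return min(local - master for master, local in samples)


def rate_error_ppm(window_a, window_b):
    """Estimate the local clock's frequency error, in parts per million,
    from the drift of the minimum difference between two windows."""
    drift_us = min_delay_difference(window_b) - min_delay_difference(window_a)
    elapsed_us = window_b[0][0] - window_a[0][0]  # master time between windows
    return drift_us * 1e6 / elapsed_us


if __name__ == "__main__":
    # Hypothetical data: transit delays of 50, 10 and 300 us, then
    # 80, 10 and 40 us one second later with the local clock 20 us ahead.
    first = [(0, 50), (1_000, 1_010), (2_000, 2_300)]
    second = [(1_000_000, 1_000_100), (1_001_000, 1_001_030),
              (1_002_000, 1_002_060)]
    print(min_delay_difference(first), min_delay_difference(second))
    print(rate_error_ppm(first, second))
```

A positive result in this sketch means the local clock is gaining on the master and would be slowed; packets that experienced moderate or large delay never influence the estimate because only the minimum of each window is used.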






[Fig. 9: block diagram of a send/receive node. Blocks: NETWORK JACK; NETWORK PHYSICAL INTERFACE (PHY); PACKET INTERFACE; PACKET GENERATOR; PACKET FILTER; CLOCK PACKETS; STATISTICAL FILTER; CLOCK CONTROL PLL; AUDIO PACKETS; PACKER; STRIPPER; N OUTPUT CHANNEL BUFFERS (NOMINALLY 12 SAMPLES LONG, EACH); N INPUT CHANNEL BUFFERS (NOMINALLY 24 SAMPLES LONG, EACH); SERIAL TO PARALLEL; PARALLEL TO SERIAL; ANALOG TO DIGITAL; DIGITAL TO ANALOG; N AUDIO INPUTS; N AUDIO OUTPUTS. Note in figure: DIFFERENT TYPES OF AUDIO INTERFACES, e.g. ANALOG, AES/EBU DIGITAL, DIRECT TO DSP. Reference numerals: 140, 142, 144, 210, 212, 213, 216, 220, 224, 230.]






(19) Europäisches Patentamt
     European Patent Office
     Office européen des brevets

(11) EP 1 432 203 A3

(12) EUROPEAN PATENT APPLICATION

     Date of publication A3: 05.12.2007 Bulletin 2007/49

(51) Int Cl.: H04L 29/06, H04M 7/00, H04L 12/56

(43) Date of publication A2: 23.05.2004 Bulletin 2004/26

(21) Application number: 03025357.9

(22) Date of filing: 04.11.2003

(84) Designated Contracting States:
     AT BE BG CH CY CZ DE DK EE ES FI FR GB GR
     HU IE IT LI LU MC NL PT RO SE SI SK TR
     Designated Extension States:
     AL LT LV MK

(30) Priority: 17.12.2002 US 433922 P
               03.04.2003 US 406396

(71) Applicant: TLS Corporation
     Cleveland, Ohio 44114 (US)

(72) Inventors:
     • Shay, Gregory F.
       Mentor, OH 44060 (US)
     • Church, Steven
       Avon Lake, OH 44012 (US)

(74) Representative: Henkel, Feiler & Hänzel
     Patentanwälte
     Maximiliansplatz 21
     80333 München (DE)

(54) Low latency digital audio over packet switched networks



(57) Method and Apparatus for delivering audio signals
from a source node to a destination node on a network.
The apparatus uses a number of switches that
transmit prioritized data on a packet network. The switches
are coupled to a number of send/receive nodes for
sending and receiving digital audio signals on the data
network. The audio packet size and the receive buffers
are sized to store a minimum possible number of audio
samples to minimize latency in processing audio signals
arriving at said receive node, but still ensure audio delivery
without interruption due to packet data network delay.
An additional feature of the invention is recovery of clock
synchronization over the same data network by novel
arrangement of transmission of timing packets on the
network. By sending a multiplicity of packets at irregular
intervals a minimum network transit delay can be determined
by each of the receive nodes which allows the
receive nodes to filter out packet network transit delay
error and maintain accurate local clocks.



[Representative drawing: Fig. 3, timing diagram of audio groups, audio packets and a non-audio packet at the switch output, showing unused network bandwidth time, the audio frame time and the transmission time of the largest packet (e.g. 120 us).]



Printed by Jouve, 75001 PARIS (FR) 







European Patent Office

EUROPEAN SEARCH REPORT

Application Number: EP 03 02 5357

DOCUMENTS CONSIDERED TO BE RELEVANT

(Columns: Category; Citation of document with indication, where appropriate, of relevant passages; Relevant to claim; Classification of the application (IPC))

Citation: HOLMEIDE SKEIE: "VOIP DRIVES REALTIME ETHERNET"
INDUSTRIAL ETHERNET BOOK, FIELDBUS, TITCHFIELD, GB,
vol. 5, March 2001 (2001-03), pages 26-29, XP009083398
ISSN: 1470-5745
* page 1, left-hand column, line 1 - page 3, middle column, line 3 *
Relevant to claim: 1-4, 11-16

Citation: SKEIE T ET AL: "The road to an end-to-end deterministic ethernet"
FACTORY COMMUNICATION SYSTEMS, 2002. 4TH IEEE INTERNATIONAL WORKSHOP ON
AUG 28 - 30, 2002, PISCATAWAY, NJ, USA, IEEE,
28 August 2002 (2002-08-28), pages 3-9, XP010623287
ISBN: 0-7803-7586-6
* abstract *
* page 4, left-hand column, line 1 - page 6, right-hand column, line 11; figure 3 *
Relevant to claim: 1-4, 11-16

Citation: KARAM M J ET AL: "Analysis of the delay and jitter of voice traffic over the internet"
PROCEEDINGS IEEE INFOCOM 2001. THE CONFERENCE ON COMPUTER COMMUNICATIONS.
20TH. ANNUAL JOINT CONFERENCE OF THE IEEE COMPUTER AND COMMUNICATIONS SOCIETIES.
ANCHORAGE, AK, APRIL 22 - 26, 2001, PROCEEDINGS IEEE INFOCOM. THE CONFERENCE
ON COMPUTER COMMUNI,
vol. VOL. 1 OF 3. CONF. 20,
22 April 2001 (2001-04-22), pages 824-833, XP010538768
ISBN: 0-7803-7016-3
* page 827, right-hand column, line 15 - page 831, right-hand column, line 16 *
Relevant to claim: 1-4, 11-16

CLASSIFICATION OF THE APPLICATION (IPC):
INV.
H04L29/06
H04M7/00
H04L12/56

TECHNICAL FIELDS SEARCHED (IPC):
H04L
H04M

-/--

The present search report has been drawn up for all claims

Place of search: Munich
Date of completion of the search: 29 October 2007
Examiner: Tager, Wolfgang

CATEGORY OF CITED DOCUMENTS

X : particularly relevant if taken alone
Y : particularly relevant if combined with another
    document of the same category
A : technological background
O : non-written disclosure
P : intermediate document

T : theory or principle underlying the invention
E : earlier patent document, but published on, or
    after the filing date
D : document cited in the application
L : document cited for other reasons

& : member of the same patent family, corresponding
    document







DOCUMENTS CONSIDERED TO BE RELEVANT

Citation: EP 1 006 686 A1 (CIT ALCATEL [FR])
7 June 2000 (2000-06-07)
* paragraphs [8888] , [0825] *
Relevant to claim: 5-10

Citation: US 6 438 702 B1 (HODGE JAMES E [US])
20 August 2002 (2002-08-20)
* column 5, line 3 - line 29 *
Relevant to claim: 5-10

Citation: MILLS D L: "MEASURED PERFORMANCE OF THE NETWORK TIME PROTOCOL IN THE INTERNET SYSTEM"
NETWORK WORKING GROUP REQUEST FOR COMMENTS, no. 1128,
October 1989 (1989-10), pages 1-11,1*
XP002978984
* page 15, paragraph 2 *
* page 18 *
Relevant to claim: 5-10

TECHNICAL FIELDS SEARCHED (IPC):

The present search report has been drawn up for all claims

Place of search: Munich
Date of completion of the search: 29 October 2007
Examiner: Tager, Wolfgang





European Patent Office

Application Number: EP 03 02 5357

CLAIMS INCURRING FEES

The present European patent application comprised at the time of filing more than ten claims.

□ Only part of the claims have been paid within the prescribed time limit. The present European search
report has been drawn up for the first ten claims and for those claims for which claims fees have
been paid, namely claim(s):

LACK OF UNITY OF INVENTION

The Search Division considers that the present European patent application does not comply with the
requirements of unity of invention and relates to several inventions or groups of inventions, namely:

see sheet B

All further search fees have been paid within the fixed time limit. The present European search report has
been drawn up for all claims.

□ As all searchable claims could be searched without effort justifying an additional fee, the Search Division
did not invite payment of any additional fee.

□ Only part of the further search fees have been paid within the fixed time limit. The present European
search report has been drawn up for those parts of the European patent application which relate to the
inventions in respect of which search fees have been paid, namely claims:

□ None of the further search fees have been paid within the fixed time limit. The present European search
report has been drawn up for those parts of the European patent application which relate to the invention
first mentioned in the claims, namely claims:






LACK OF UNITY OF INVENTION
SHEET B

Application Number: EP 03 02 5357

The Search Division considers that the present European patent application does not comply with the
requirements of unity of invention and relates to several inventions or groups of inventions, namely:

1. claims: 1-4, 11-16

   Provide low latency digital audio over ethernet.

2. claims: 5-10

   Provide synchronisation over a network.






ANNEX TO THE EUROPEAN SEARCH REPORT
ON EUROPEAN PATENT APPLICATION NO. EP 03 02 5357

This annex lists the patent family members relating to the patent documents cited in the above-mentioned European search report.
The members are as contained in the European Patent Office EDP file on 29-10-2007.
The European Patent Office is in no way liable for these particulars which are merely given for the purpose of information.

Patent document            Publication    Patent family        Publication
cited in search report     date           member(s)            date

EP 1006686 A1              07-06-2000     AT 303022 T          15-09-2005
                                          DE 69926857 D1       29-09-2005
                                          DE 69926857 T2       08-06-2006
                                          FR 2786964 A1        09-06-2000
                                          JP 2000174821 A      23-06-2000
                                          US 6819685 B1        16-11-2004

US 6438702 B1              20-08-2002     NONE

For more details about this annex : see Official Journal of the European Patent Office, No. 12/82






